Fix server rotation resource waste with unified sequential retry logic#37
Fix server rotation resource waste with unified sequential retry logic#37
Conversation
…network errors Co-authored-by: Sunwuyuan <88357633+Sunwuyuan@users.noreply.github.com>
Deploying classworks with
|
| Latest commit: |
17f7b46
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://d7ebca51.classworks.pages.dev |
| Branch Preview URL: | https://copilot-fix-get-effective-se.classworks.pages.dev |
Co-authored-by: Sunwuyuan <88357633+Sunwuyuan@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR optimizes the server rotation mechanism to reduce unnecessary requests by using a "primary server first" strategy. Instead of attempting rotation on every request, the system now uses the last successful server and only falls back to rotation when network errors occur.
Changes:
- Introduced
tryWithPrimaryServer()function that attempts the primary (last working) server first, rotating only on network errors - Added
isNetworkError()helper to distinguish between network errors (requiring rotation) and application errors (404, 401, etc.) - Updated all API clients (kvServerProvider.js, api.js) to use the new optimized approach while preserving full rotation for connection testing
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| src/utils/serverRotation.js | Core optimization logic: added primaryServerUrl tracking, isNetworkError() detection, and tryWithPrimaryServer() function |
| src/utils/providers/kvServerProvider.js | Updated all methods (loadData, saveData, loadKeys, loadNamespaceInfo, updateNamespaceInfo) to use tryWithPrimaryServer |
| src/utils/api.js | Updated getNamespaceInfo to use tryWithPrimaryServer |
| src/utils/socketClient.js | Removed unused tryWithRotation import |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
src/utils/serverRotation.js
Outdated
| // Network errors from axios typically have no response or specific error codes | ||
| if (!error.response) { | ||
| return true; // No response = network issue | ||
| } | ||
|
|
||
| // Server timeout or connection errors | ||
| if (error.code === 'ECONNABORTED' || error.code === 'ETIMEDOUT' || | ||
| error.code === 'ENOTFOUND' || error.code === 'ECONNREFUSED') { | ||
| return true; | ||
| } | ||
|
|
||
| // 5xx errors might indicate server issues worth retrying | ||
| const status = error.response?.status; | ||
| if (status >= 500) { | ||
| return true; | ||
| } | ||
|
|
There was a problem hiding this comment.
The isNetworkError function has a logical issue. When an error has a response (line 124-126 check fails), but has one of the error codes like 'ECONNABORTED' or 'ETIMEDOUT', the function will still check for these codes on line 129-131. However, these error codes are typically set when there is no response object. If error.response exists, these codes are unlikely to be present, making this second check unreachable or ineffective.
Consider restructuring the logic to check error codes before checking for response, or verify that axios sets both error.code and error.response simultaneously for timeout scenarios.
| // Network errors from axios typically have no response or specific error codes | |
| if (!error.response) { | |
| return true; // No response = network issue | |
| } | |
| // Server timeout or connection errors | |
| if (error.code === 'ECONNABORTED' || error.code === 'ETIMEDOUT' || | |
| error.code === 'ENOTFOUND' || error.code === 'ECONNREFUSED') { | |
| return true; | |
| } | |
| // 5xx errors might indicate server issues worth retrying | |
| const status = error.response?.status; | |
| if (status >= 500) { | |
| return true; | |
| } | |
| // Server timeout or connection errors (may or may not have a response) | |
| if ( | |
| error.code === "ECONNABORTED" || | |
| error.code === "ETIMEDOUT" || | |
| error.code === "ENOTFOUND" || | |
| error.code === "ECONNREFUSED" | |
| ) { | |
| return true; | |
| } | |
| // Network errors from axios often have no response at all | |
| if (!error.response) { | |
| return true; // No response = network issue | |
| } | |
| // 5xx errors might indicate server issues worth retrying | |
| const status = error.response?.status; | |
| if (status >= 500) { | |
| return true; | |
| } |
src/utils/serverRotation.js
Outdated
| // Track the current primary server (the one that's currently working) | ||
| let primaryServerUrl = null; |
There was a problem hiding this comment.
The module-level variable primaryServerUrl introduces potential race condition issues in concurrent scenarios. If multiple API requests fail and trigger rotation simultaneously, they could race to update primaryServerUrl, potentially leading to inconsistent state. Additionally, this state is never reset when servers become unavailable, which could cause the application to stick with a failed server even after configuration changes.
Consider using a more robust state management approach, such as storing this in a singleton with proper synchronization, or implementing a mechanism to periodically validate and reset the primary server.
| * Get the effective server URL for the current provider | ||
| * For classworkscloud, returns the first server in the list | ||
| * For classworkscloud, returns the primary server (last known working) or first server in the list | ||
| * For other providers, returns the configured domain |
There was a problem hiding this comment.
The documentation comment states "For classworkscloud, returns the primary server (last known working)" but doesn't mention the important behavior that this primary server persists across application sessions in the module-level state. This could be misleading since developers might expect this to reset on page reload or application restart. The primary server will actually reset to null on page reload, not persist the "last known working" server.
Update the documentation to clarify that the primary server is tracked only during the current application session and resets to the first server in the list on page reload.
src/utils/serverRotation.js
Outdated
|
|
||
| // 5xx errors might indicate server issues worth retrying | ||
| const status = error.response?.status; | ||
| if (status >= 500) { |
There was a problem hiding this comment.
The function only checks for status codes >= 500, but doesn't handle the case where status is exactly 500. While this works correctly (500 >= 500 is true), it would be clearer and more maintainable to use status >= 500 && status < 600 or add a comment explaining that all 5xx errors should trigger rotation. This makes the intent explicit that we're only targeting server errors, not theoretical status codes above 600.
| if (status >= 500) { | |
| if (status >= 500 && status < 600) { |
src/utils/serverRotation.js
Outdated
| // Use full rotation, which will update the primary server if a different one succeeds | ||
| return await tryWithRotation(operation, options); |
There was a problem hiding this comment.
When tryWithPrimaryServer falls back to tryWithRotation on a network error, it will attempt the primary server again as part of the rotation. This means the failing primary server will be tried twice in sequence - once at line 164, and again when tryWithRotation starts iterating through the server list (which includes the primary server). This results in unnecessary duplicate requests to the same failing server.
Consider modifying the rotation logic to skip the already-tried primary server, or track which servers have already been attempted in the current request chain.
| // Use full rotation, which will update the primary server if a different one succeeds | |
| return await tryWithRotation(operation, options); | |
| // Wrap the operation to skip the already-tried primary server during rotation | |
| const wrappedOperation = async (serverUrl) => { | |
| if (serverUrl === primaryUrl) { | |
| // Skip making a duplicate request to the primary server; let rotation move on | |
| throw new Error(`Skipping already-tried primary server during rotation: ${serverUrl}`); | |
| } | |
| return operation(serverUrl); | |
| }; | |
| // Use full rotation, which will update the primary server if a different one succeeds | |
| return await tryWithRotation(wrappedOperation, options); |
|
@copilot 修改,对于主服务器问题只需要当前会话即可,不需要长期存储。不要判断错误类型,只要返回不是正常的返回就继续尝试 |
Co-authored-by: Sunwuyuan <88357633+Sunwuyuan@users.noreply.github.com>
已修改。移除了错误类型判断逻辑,现在任何错误都会继续尝试其他服务器。主服务器跟踪仅在当前会话有效(页面刷新后重置)。 Changes in commit f56c9b5:
|
|
@copilot review代码 |
Code Review Summary已审查代码,整体实现符合要求。以下是当前实现的关键点: ✅ 优点
|
|
@copilot 对于服务器轮询尝试,不要区分主服务器和其他服务器,使用统一的逻辑,从上到下尝试 |
…l requests Co-authored-by: Sunwuyuan <88357633+Sunwuyuan@users.noreply.github.com>
|
此拉取请求已在 Kernyr 上被提及。那里可能有相关详细信息: |
All API requests were attempting full server rotation on every call, causing resource waste. This PR fixes the issue by implementing a clean, unified approach where all requests use the same sequential retry logic.
Changes
Unified server rotation
tryWithRotationdirectly for consistent behaviorUpdated API clients
kvServerProvider.js: All methods (loadData, saveData, loadKeys, loadNamespaceInfo, updateNamespaceInfo) →tryWithRotationapi.js: getNamespaceInfo →tryWithRotationtryWithRotation(unchanged)Implementation
Benefits
Original prompt
💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.